fix(streamable-http): reject duplicate in-flight request ids #3063

Open
Sammy-Dabbas wants to merge 3 commits into modelcontextprotocol:main from Sammy-Dabbas:issue-3060-duplicate-request-ids

@Sammy-Dabbas

Fixes #3060.

The stateful streamable HTTP transport keys per-request routing by the request id and assigns the slot without an existence check. Two concurrent POSTs sharing an id cross-wire: the second overwrites the first's routing slot, the first request's response is delivered to the second POST, and the first hangs while its stream leaks. Reproduced on main before the fix.

This adds a guard in _handle_post_request that rejects a POST whose request id is already in flight on the session with HTTP 400 and JSON-RPC -32600. The guard is placed before the JSON/SSE branch so both response modes are covered. The spec requires request ids to be unique within a session. Sequential reuse of an id after the earlier request completes is deliberately still accepted, because deployed clients send a constant id for every request; a regression test pins that behavior.
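For readers skimming the thread, the guard described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `SessionRouter`, `try_register`, and `complete` are hypothetical names, and the real transport tracks per-request streams rather than empty lists.

```python
from __future__ import annotations

INVALID_REQUEST = -32600  # JSON-RPC 2.0 "Invalid Request" error code


class SessionRouter:
    """Hypothetical stand-in for the transport's per-request routing table."""

    def __init__(self) -> None:
        # Maps an in-flight request id to its (placeholder) routing slot.
        self._in_flight: dict[str | int, list[str]] = {}

    def try_register(self, request_id: str | int) -> tuple[int, dict | None]:
        """Return (HTTP status, JSON-RPC error body or None)."""
        if request_id in self._in_flight:
            # Existence check: refuse instead of overwriting the first slot.
            error = {
                "jsonrpc": "2.0",
                "id": request_id,
                "error": {
                    "code": INVALID_REQUEST,
                    "message": "Request id already in flight",
                },
            }
            return 400, error
        self._in_flight[request_id] = []
        return 200, None

    def complete(self, request_id: str | int) -> None:
        """Free the slot so the same id can be reused sequentially."""
        self._in_flight.pop(request_id, None)
```

In this sketch a second `try_register(1)` while the first is still in flight yields a 400 with code -32600, while calling `complete(1)` first makes the id usable again, which mirrors the sequential-reuse behavior the regression test pins.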

Tests: the new duplicate-id test fails on unpatched main and passes with the fix. tests/shared/test_streamable_http.py passes 67/67 and the full suite passes locally (5282 passed). A stress run of 500 rapid sequential same-id requests produced no spurious rejections. ruff and pyright are clean on the touched files.

Scope note: this is the transport-level guard only. The dispatcher-level blind overwrite in jsonrpc_dispatcher.py (TODO from #3046) is deliberately left to the follow-up discussed on the issue, since the two guards compose.

Disclosure: this change was developed with AI assistance (Claude Code). I reviewed the change and the test results before submitting.

The transport keyed per-request routing by request id and assigned the
slot without checking for an existing entry, so a second concurrent
POST with the same id silently overwrote the first request's routing
slot. One request received the other's payload and the other hung.

Reject a POST whose request id is already in flight on the session
with HTTP 400 and JSON-RPC -32600. Ids may still be reused once the
earlier request completes, which deployed clients rely on.

Fixes modelcontextprotocol#3060.

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 2 files

Comment thread src/mcp/server/streamable_http.py
@Sammy-Dabbas

Author

Good catch on the priming race, that was a real gap: the guard ran before the awaited store_event, so a same-id POST could slip in during event-store persistence. Fixed by reserving the routing slot synchronously with the guard and minting the priming event afterwards, with the reservation released if the event store raises so the 500 path leaves nothing in flight. Added a regression test that suspends a gated event store mid-priming and asserts the concurrent duplicate is still rejected; it fails by timeout on the previous commit and passes now. Full transport module is green (68 tests).
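The ordering change can be illustrated with a small asyncio sketch. It assumes a simplified model, not the transport's real code: `GatedEventStore`, `Transport`, `handle_post`, and `demo` are hypothetical, and the `store_event` signature here is reduced to a single argument. The point it tries to show is that the check and the reservation share one synchronous stretch, so a duplicate arriving during the awaited priming call still sees the slot.

```python
from __future__ import annotations

import asyncio


class GatedEventStore:
    """Hypothetical event store whose store_event blocks until released."""

    def __init__(self) -> None:
        self.entered = asyncio.Event()
        self.release = asyncio.Event()

    async def store_event(self, stream_id: str) -> str:
        self.entered.set()          # signal that priming has started
        await self.release.wait()   # stay suspended until the test lets go
        return f"event-for-{stream_id}"


class Transport:
    def __init__(self, store: GatedEventStore) -> None:
        self._store = store
        self._slots: dict[str, str | None] = {}

    async def handle_post(self, request_id: str) -> int:
        # No await between the check and the write, so no other task can
        # interleave here.
        if request_id in self._slots:
            return 400
        self._slots[request_id] = None  # reservation
        try:
            self._slots[request_id] = await self._store.store_event(request_id)
        except Exception:
            # Release the reservation so a failure leaves nothing in flight.
            del self._slots[request_id]
            raise
        return 200  # slot cleanup on completion is omitted from this sketch


async def demo() -> tuple[int, int]:
    store = GatedEventStore()
    transport = Transport(store)
    first = asyncio.create_task(transport.handle_post("1"))
    await store.entered.wait()                  # first POST is mid-priming
    second = await transport.handle_post("1")   # duplicate arrives now
    store.release.set()
    return await first, second
```

Running `asyncio.run(demo())` in this model returns `(200, 400)`: the first POST completes once the gate opens and the concurrent duplicate is rejected. With the guard and reservation separated by the await, the duplicate would instead pass the check.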

In resumable SSE mode the priming event is minted by awaiting
EventStore.store_event between the duplicate-id guard and the routing
slot registration, so a concurrent POST reusing the id could pass the
guard during that await and overwrite the slot.

Reserve the slot synchronously with the guard and mint the priming
event afterwards, releasing the reservation if the event store raises
so the outer 500 path leaves nothing in flight. Adds a regression test
that suspends a gated event store mid-priming and asserts the duplicate
POST is still rejected.
@akminx

akminx commented Jul 5, 2026


Nice — the slot reservation before priming closes the resumable-SSE race cleanly. Two mechanical CI things are what's currently red, both in the newly added test code rather than the fix itself:

1. Coverage gate (the 20 test (...) matrix jobs). coverage report fails with:

tests/shared/test_streamable_http.py   990   1   108   1   99.82%   844
TOTAL                                                              99.99%
Coverage failure: total of 99.99 is less than fail-under=100.00

Line 844 in the new test is uncovered (this repo enforces fail-under = 100 with branch = true, and tests/ is in [tool.coverage.run] source). Worth running uv run --frozen --no-sync coverage run -m pytest -n auto && uv run --frozen --no-sync coverage combine && uv run --frozen --no-sync coverage report locally to see the exact arc — usually it's one branch of the gated-event-store path that isn't taken.

2. pre-commit (pyright). The pre-commit job fails on 6 pyright errors in the same new test, e.g.:

tests/shared/test_streamable_http.py:853:28 - error: Expected mapping for dictionary unpack operator (reportGeneralTypeIssues)
tests/shared/test_streamable_http.py:851:13 - error: Type of "init_request" is partially unknown

Looks like init_request needs a concrete type so the **-unpack resolves.
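As a rough illustration of that suggestion (the test code itself is not shown in this thread, so the payload fields and the `request_with_id` name below are assumptions), annotating the dict gives the type checker a known mapping type to unpack:

```python
from typing import Any

# An explicit annotation tells the type checker this is a mapping with
# string keys, so the ** unpack below has a known type to work with.
init_request: dict[str, Any] = {
    "jsonrpc": "2.0",
    "method": "initialize",
    "params": {},
}

# Build a variant of the payload by unpacking the typed dict.
request_with_id: dict[str, Any] = {**init_request, "id": 1}
```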

Once those two are green the change looks solid. For what it's worth, I'm putting up the dispatcher-layer piece we discussed (the jsonrpc_dispatcher.py _in_flight overwrite + the cancellation-targeting case) as a separate PR, scoped exactly as you described — transport guard and dispatcher guard composing.

Mark the test helper's defensive raise no-cover, matching the module
helper's convention, and give the request dicts concrete types so the
dictionary unpacks resolve under pyright. Import the protocol version
header from its defining module.
@Sammy-Dabbas

Author

Thanks for running those down, both fixed. The uncovered line was the test helper's defensive raise, now marked no-cover matching the module-level first_sse_data helper's convention. The pyright errors came from the untyped dict unpacks; init_request and the tool call payload now have concrete types, and I moved the protocol version header import to its defining module while there. pyright is clean on the file and the coverage gate passes locally apart from pre-existing platform-conditional lines in other test files that only execute on Linux.

Glad the dispatcher piece is up as #3064, and the scoping is exactly right. The two guards compose and neither alone closes the issue, so from my side please keep it open.
